NIC-Based Reduction in Myrinet Clusters: Is It Beneficial?
Abstract
Reduction-to-one and reduction-to-all are common collective operations in parallel and distributed systems. Because they can involve many processes, it is important to make them fast and efficient. Some modern network interface controllers (NICs) for system area networks (SANs) have programmable processors that can be used to offload protocol processing from the host processor. In this paper we investigate the use of the NIC processor to improve the performance of reduction operations. We implemented a NIC-based reduction-to-one operation that supports integer and floating-point operations, and evaluated our implementation. Our evaluation shows that the NIC-based operation outperforms the traditional host-based approach by a factor of up to 1.19. We also see that NIC-based reduction can reduce host CPU utilization by a factor of 2.7 and can reduce the effects of process skew by a factor of up to 4.5.
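To make the two operations concrete, the following is a minimal sketch of what a reduction-to-one and a reduction-to-all compute. It is a single-process simulation, not the NIC-based implementation described in the abstract. The tree-shaped combining order and the names `reduce_to_one` and `reduce_to_all` are assumptions chosen for illustration; the abstract does not state which combining order is used.

```python
from operator import add


def reduce_to_one(values, op=add):
    """Simulate a tree-shaped reduction-to-one (illustrative assumption).

    values[i] stands for the contribution of process rank i.
    Returns (result held by rank 0, number of combining rounds).
    """
    if not values:
        raise ValueError("at least one process is required")
    partial = list(values)
    n = len(partial)
    step = 1
    rounds = 0
    while step < n:
        # In each round, rank r (a multiple of 2*step) combines its
        # partial result with the one held by rank r + step.
        for rank in range(0, n, 2 * step):
            partner = rank + step
            if partner < n:
                partial[rank] = op(partial[rank], partial[partner])
        step *= 2
        rounds += 1
    return partial[0], rounds


def reduce_to_all(values, op=add):
    """Reduction-to-all: every rank ends up holding the reduced value."""
    result, _ = reduce_to_one(values, op)
    return [result] * len(values)


if __name__ == "__main__":
    total, rounds = reduce_to_one([1, 2, 3, 4, 5])
    print(total, rounds)
    print(reduce_to_all([1.5, 2.5, 3.0]))
```

The operator is passed in as a parameter, so the same sketch covers integer and floating-point sums as well as operations such as `max`.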
Similar resources
Fast NIC-Based Barrier over Myrinet/GM
An efficient barrier implementation is desirable on parallel systems to obtain good parallel speedup and to support finer-grained computation. Some modern Network Interface Cards (NICs) have programmable processors which can be used to provide support for collective communications such as barrier. In this paper, we utilize such a programmable NIC to provide an efficient barrier synchronization ...
High Performance and Reliable NIC-Based Multicast over Myrinet/GM-2
Multicast is an important collective operation for parallel programs. Some Network Interface Cards (NICs), such as Myrinet, have programmable processors that can be programmed to support multicast. This paper proposes a high performance and reliable NIC-based multicast scheme, in which a NIC-based multisend mechanism is used to send multiple replicas of a message to different destinations, an...
Exploring the Performance of the Myrinet PC-Cluster on Linux
Both InfiniBand and the virtual interface architecture (VIA) aim at providing effective cluster communication. However, the specification of the former does not define APIs; it contains only an abstract description of the protocol verbs. The dependence of an implementation on the various features of the hardware, firmware, and software is not defined in the InfiniBand architecture specification...
Analysis and Enhancement of Pipelining the Protocol Overheads for a High Throughput
This paper investigates protocol overhead pipelining between the host and the network interface card (NIC). Existing research on protocol overhead pipelining assumes that protocol overheads in the host and NIC can be naturally pipelined. Our architecture-aware investigation, however, finds that the host and NIC compete against each other to access the host memory, system bus, ...
Design of System Area Network Interface Card Based on Intel IOP310
This paper proposes a design for a system area network interface card (NIC) based on the Intel IOP310 I/O processor chipset. The chipset enables the NIC to offload communication protocol processing from the host CPU. A network interface unit (NIU) based on the memory bus is embedded in the NIC. The NIU not only thoroughly compensates for the lack of high performance data tra...